Privacy-Preserving Data Mining in Electronic Surveys

Authors

  • Justin Zhijun Zhan
  • Stan Matwin
Abstract

Electronic surveys are an important resource in data mining. However, protecting respondents’ data privacy during a survey remains a challenge for the security and privacy community. In this paper, we develop a scheme to solve the problem of privacy-preserving data mining in electronic surveys. We propose a randomized response technique to collect the data from the respondents. We then demonstrate how to perform data mining computations on randomized data. Specifically, we apply our scheme to build a Naive Bayesian classifier from randomized data. Our experimental results indicate that the classification accuracy of our scheme, in which private data is protected by randomization, is close to the accuracy of a classifier built from the same data with total disclosure of private information. Finally, we develop a measure to quantify the privacy achieved by our proposed scheme.
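The abstract does not spell out the randomization scheme, so the following is only a minimal sketch of the general idea, assuming the classic one-bit randomized response model in which each respondent reports the true answer with probability `theta` and its complement otherwise. The function names, the value `theta = 0.8`, and the synthetic survey data are all hypothetical and are not taken from the paper. The sketch shows how a collector who sees only randomized answers might still estimate the true proportion, the kind of quantity a Naive Bayesian classifier needs.

```python
import random


def randomize(bit, theta, rng):
    """Report the true bit with probability theta, otherwise its complement."""
    return bit if rng.random() < theta else 1 - bit


def estimate_true_proportion(reported, theta):
    """Estimate P(true answer = 1) from randomized reports.

    Under the assumed model, P*(1) = theta * P(1) + (1 - theta) * (1 - P(1)),
    so P(1) = (P*(1) - (1 - theta)) / (2 * theta - 1).
    """
    if abs(2 * theta - 1) < 1e-12:
        raise ValueError("theta = 0.5 makes the reports uninformative")
    observed = sum(reported) / len(reported)
    estimate = (observed - (1 - theta)) / (2 * theta - 1)
    # Sampling noise can push the raw estimate slightly outside [0, 1].
    return min(1.0, max(0.0, estimate))


# Hypothetical survey: about 30% of respondents hold the sensitive attribute.
rng = random.Random(0)
theta = 0.8
true_bits = [1 if rng.random() < 0.3 else 0 for _ in range(20000)]
reported = [randomize(b, theta, rng) for b in true_bits]

true_proportion = sum(true_bits) / len(true_bits)
estimate = estimate_true_proportion(reported, theta)
print(f"true proportion: {true_proportion:.3f}, estimate: {estimate:.3f}")
```

In this sketch, values of `theta` closer to 0.5 hide individual answers better but make the estimate noisier, which is the privacy and accuracy trade-off the abstract alludes to.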

Similar resources

Opportunities and Challenges for Privacy-Preserving Visualization of Electronic Health Record Data

In this paper, we reflect on the use of visualization techniques for analyzing electronic health record data with privacy concerns. Privacy-preserving data visualization is a relatively new area of research compared to the more established research areas of privacy-preserving data publishing and data mining. We describe the opportunities and challenges for privacy-preserving visualization of el...

Privacy Preserving Data Mining in Electronic Health Record using K-anonymity and Decision Tree

In this paper, we present an accurate and efficient privacy-preserving data mining technique for Electronic Health Records (EHR) that uses k-anonymity and the C4.5 decision tree, and that is useful for generating patterns for medical research or clinical trials. Our analysis indicates that anonymization offers better privacy than other privacy-preserving methods such as randomization, cryptography, pertur...

Privacy Preserving Data Mining: An Extensive Survey

Proper integration of privacy into data mining operations is crucial for the widespread acceptance of knowledge-based applications. Data mining applications involve data-rich environments such as biomedicine (electronic health records), the Internet (web usage logs), wireless networks (mobility data from sensors), and many more. Privacy in data mining can ...

Privacy-Preserving Classification and Clustering Using Secure Multi-Party Computation

Data mining and machine learning techniques are now widely used in electronic applications in areas such as e-government, e-health, and e-business. A major issue in these types of systems, which are normally distributed among two or more parties and deal with sensitive data, is preserving the privacy of individuals’ sensitive information. Each par...

Classification via Clustering for Anonymized Data

The exponential growth of hardware technology, particularly in electronic data storage media and data processing, has raised serious ethical, philosophical, and legal issues concerning the protection of individual privacy. Data mining techniques are employed to ensure privacy. Privacy Preserving Data Mining (PPDM) techniques aim at protecting the sensitive dat...

Journal title:
  • I. J. Network Security

Volume 4

Publication date 2004